Skip to content

Declarative API Explainer#76

Open
domfarolino wants to merge 7 commits intomainfrom
declarative-api-explainer
Open

Declarative API Explainer#76
domfarolino wants to merge 7 commits intomainfrom
declarative-api-explainer

Conversation

@domfarolino
Copy link
Collaborator

@domfarolino domfarolino commented Feb 5, 2026

This is a follow-up of #26 to address #22. Note that #26 has been superseded by this PR, per #26 (comment).

@domfarolino domfarolino force-pushed the declarative-api-explainer branch from b03b258 to f8a6998 Compare February 5, 2026 16:04
@maerzhase
Copy link

The proposal looks great! its exactly what i have in mind to support as additional tools via ai11y.

@domfarolino domfarolino marked this pull request as ready for review February 19, 2026 15:56
@domfarolino
Copy link
Collaborator Author

domfarolino commented Feb 19, 2026

@bwalderman, @mfreed7, and @bvandersloot, could you take a look?

Copy link

@bvandersloot-mozilla bvandersloot-mozilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r+

creates a new declarative WebMCP tool whose input schema is generated according to
[Input schema synthesis](#input-schema-synthesis).

We also introduce the new `toolparamname` and `toolparamdescription` attributes, which apply to form
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chromium currently uses toolparamtitle, not toolparamname


```html
<form toolname="search-cars" tooldescription="Perform a car make/model search" [...]>
<input type=text toolparamname="make" toolparamdescription="The vehicle's make (i.e., BMW, Ford)" required>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI Here's what Chromium currently have:

  • The optional toolparamtitle attribute maps to the json-schema property key. If omitted, the browser defaults to the input element’s name.
  • The optional toolparamdescription attribute maps to the property description within the json-schema. In its absence, the browser uses the textContent of the associated HTML element but skips descendants that are labelable or, if no label exists, the aria-description.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI Here's what Chromium currently have:

  • The optional toolparamtitle attribute maps to the json-schema property key. If omitted, the browser defaults to the input element’s name.

We don't fall back to the name, we just omit the title field in that case.

I think we probably don't need the toolparamtitle attribute at all. It's not currently part of the proposal, and I suggest keeping it that way. (We will still want to synthesize the title field in some cases, but we can do that based on <label>, etc.)

Copy link
Collaborator Author

@domfarolino domfarolino Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not currently part of the proposal, and I suggest keeping it that way.

Does this mean Chromium is getting rid of the toolparamtitle attribute? Otherwise, we certainly don't want Chrome to ship it, but the spec exclude it, right?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@domfarolino Indeed, we wouldn't ship that attribute if it doesn't end up in the spec. The prototype in Chromium supports it right now only because it was part of the original/early proposal.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but since you're saying we shouldn't include it here, I'm wondering if you're recommending that because Chrome plans to rip it out of our implementation? Or do you think Chrome will keep it, and we should eventually put it in our spec proposal?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to rip it out, because there is something to be said for keeping this API as "low footprint" as possible. I understand that the author needs an "escape hatch" when <label> / ARIA is still insufficient to lead the agent on the right path (toolparamdescription). However, I don't think I see a need for two such escape hatches? HTML forms should already be reasonably semantically equipped to provide a meaningful title for things via <label> (etc).

<input ... toolparamtitle="foo" toolparamdescription="bar">

would synthesize to two free-text, author-provided fields right next to each other:

{
    // ...
    "title":"foo",
    "description":"bar"
}

If "description":"bar" alone is not enough to guide the agent, then it seems to me like the author might just as well send "description":"Description that covers both foo and bar", instead of having to worry about what passes as a "title" and what passes as a "description".

Chrome plans

This is just my opinion right now, so I'm not sure if that amounts to a "Chrome plan" just yet. We would presumably follow whatever the group decision is. Until then we should be free to suggest a "default" in our proposal, and I'm suggesting that we omit it until someone asks for it. 🙂

When a form element performs a navigation, the first `<script type=application/ld+json>` tag on the
target page is used as the cross-document tool's "response" that gets sent to the model.

When no such a tag is present, probably we'll decide that the page's entire contents is sent to the
Copy link

@MiguelsPizza MiguelsPizza Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on defaulting to a simple "form submitted successfully" message as the tool response? The signal to noise ratio of raw HTML can be huge and I'd argue it's better not to send it at all rather than send all of it.

"Accurate semantic representation" and "useful to the model" aren't the same thing. A content-heavy result page could easily blow out context in a way that's hard to predict or debug.

3. Two ways of getting a form response back to the agent that invoked the form tool:
1. `SubmitEvent#respondWith()`, which lets JavaScript on the page override the default form
action, and pipe a response back to the agent without navigating the page.
2. Extracting `<script type="application/json-ld">` tags on the page that the form navigated to,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some time to try out the JSON-LD cross document ergonomics this week and I'm leaning towards declarative tools not concerning themselves with cross-document outputs at all. I'd be curious to hear others' thoughts on this. The cross-document concern feels a bit self-inflicted and I found myself bypassing it in favor of read tools on the navigated-to page that the model can call on follow ups.

The model just needs to know: (a) did I fill the form correctly, if not what are the validation errors? (this does not cause a navigation) and (b) did it submit successfully
(this could just be a generic message). From there it can follow up with read tools on the destination page to verify context.

This would let us drop concerns around output schema for declarative cross document tools as well.

+1 to @bwalderman's <context> idea. That or something like it feels like the right primitive here rather than a cross-document output API bolted onto forms. This sort of thing would be useful in general rather than as a specific solution for form submissions
that cause navigations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have concerns about <script type="application/json-ld">. The <script> tag isn't exactly intended for this purpose to begin with, and it's being used only to solve for cross-document form submission. That's just one specific case of a more general problem: What happens when an agent needs to initialize or catch-up state, either because it started browsing a new page, or it used a tool (imperative or declarative) that navigated. Read tools are important for the agent to get this initial state after navigation, so I'm leaning towards figuring out the declarative approach for read tools, and then letting this be the solution for cross-document form submission. This is the reason why I brought the <context> tag from the VOIX framework to everyone's attention. We'll want something like it anyway for the declarative approach to be considered complete, and it also happens to be a reasonable solution for cross-document form submission.

It may not end up being called <context>. It would probably be <resource> to keep aligned with MCP, but the semantics would be the same.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also found the idea of vacuuming up "circumstantial" JSON-LD post-navigation and declaring it a "result" to a specific tool execution very questionable.

I'm leaning towards figuring out the declarative approach for read tools, and then letting this be the solution for cross-document form submission.

@bwalderman In that world, the result of a declarative tool execution would indeed be something like "form submitted successfully" like @MiguelsPizza says? And then other side-effects of the tool execution would be understood by the agent from "read tools" it would anyway execute after any navigation?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andruud that is my interpretation as well. Rather than try to work around loss of state due to navigation, accept this as a limitation and let a simple "form success" message be a cue to the agent that it should pull new state.

};
```

**`toolactivated` and `toolcanceled` events
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when is the toolcanceled event dispatched?

@anssiko anssiko removed the Agenda+ label Mar 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants